Database retrieval system for responding to natural language queries with corresponding tables.



An information retrieval system is used for retrieving information
from a database. The information retrieval system includes a parser
for parsing a natural language input query into constituent phrases
as a syntax analysis result. The system also includes a virtual table
for converting phrases of the natural language query to retrieval keys
that are possessed by the database. The virtual table accounts for
particles that modify the phrases in the input query. A collating unit
is provided in the system for preparing a database retrieval formula
from the syntax analysis result by selecting a virtual table that is
used to convert the phrases to the keys possessed by the database.
Lastly, the system includes a retrieval execution unit for retrieving
data from the database on the basis of the database retrieval formula.

This invention relates generally to database
retrieval systems for retrieving information stored in 
a database, and, more particularly, to database 
retrieval systems for retrieving information stored in 
a database using natural language expressions. 

Fig. 1 is a diagram illustrating a conventional 
database retrieval system for retrieving data from a 
table formatted database in response to a natural 
language query. A natural language query is a 
request for data that is set forth in a natural lan- 
guage, such as English, Japanese, French, etc. 
The illustrated database retrieval system is described in more detail
in Kinukawa, "A Natural Language Interface Processor Based on the
Hierarchical-Tree Structure Model of Relation Table," Journal of
Information Processing Society of Japan, Vol. 27, No. 5 (1986),
pp. 499-509. This system is designed to process queries in Japanese.
For the examples described below, the English translations of Japanese
words and phrases are provided in parentheses.

The database retrieval system shown in Fig. 1 
includes an input unit 2, such as a keyboard, for 
entering a natural language query 1. The system 
also includes a communications controller 3 for 
forwarding the natural language query 1 to a re- 
trieval sentence analysis unit 5. The retrieval sen- 
tence analysis unit 5 processes the input query 1 
to produce a hierarchical model of the query. The 
system additionally includes a word dictionary 4, 
that is constructed on the basis of the content of a 
database 9, and a hierarchical table model 6 for 
hierarchically expressing the content of the 
database. The dictionary 4 and hierarchical table 
model 6 are used by the retrieval sentence analy- 
sis unit 5 in analyzing the natural language query 
1. The retrieval sentence analysis unit 5 performs 
both vocabulary analysis and syntactic/semantic 
analysis on the natural language query 1. The 
retrieval sentence analysis unit 5 produces a re- 
trieval sentence analysis result 7 as output that is 
forwarded to a retrieval processing unit 8. The 
retrieval processing unit 8 uses the retrieval sen- 
tence analysis result 7 to retrieve data from the 
database 9. 

The depiction of the conventional database retrieval system shown in
Fig. 1 is a functional description intended to show the interaction
between the respective components of the system. The components shown
in Fig. 1 are, in fact, implemented in a data processing system 10,
such as that shown in Fig. 2. The data processing system 10 includes a
central processing unit (CPU) 11, a memory 12, the communications
controller 3, an output device 17 and the input unit 2. Each of these
components is coupled to a bus 13. The retrieval sentence analysis
unit 5 and the retrieval processing unit 8 are implemented in software
that is executed by the CPU 11 (Fig. 2). The software is stored in the
memory 12. The word dictionary 4 (Fig. 1), the hierarchical table
model 6 and the database 9 are stored within the memory 12 (Fig. 2).

Fig. 3a provides a more detailed depiction of an example of the word
dictionary 4. As this figure shows, the dictionary includes a
plurality of entries, and each entry includes three fields. The header
field identifies the term or phrase associated with the entry, whereas
the part of speech field identifies the part of speech of the term or
phrase. Lastly, the type field identifies the type of term or phrase
that is used. In the example shown in Fig. 3a, the types are "item
name" and "data expression word".

Fig. 3b provides a more detailed depiction of the hierarchical table
model 6. This model 6 sets forth the hierarchical relationship between
the respective tables. Each table specifies a number of attributes.
For instance, table 14 includes the attributes of "date", "commodity
code", "commodity group code", and "sales". The "commodity code"
attribute is also an attribute in table 16, which is hierarchically
related with table 14. Similarly, the attribute of "commodity group
code" is an attribute of both table 16 and table 18. The table 14 is a
higher order table than tables 16 and 18. Moreover, table 16 is a
higher order table than table 18. This hierarchical table model is
consistent with the relational model for data proposed by E. F. Codd
in "A Relational Model of Data for Large Shared Data Banks,"
Communications of the ACM, June 1970, pp. 377-387.

Fig. 3c provides an illustration of the database 9. The database 9
includes table A, table B and table C. Each of the tables A, B and C
includes a different type of information. For example, table A
contains sales information, table B includes commodity information,
and table C includes commodity group information. These tables are
used in conjunction to obtain information requested by the natural
language query 1 (Fig. 1).

Operation of the system shown in Fig. 1 will now be described.
Initially, a natural language query 1 is entered using the input
unit 2. When a keyboard is used as the input unit 2, the query is
entered simply by typing the query. The query 1 is then passed to the
conversation control unit 3, which forwards the query to the retrieval
sentence analysis unit 5. The retrieval sentence analysis unit 5
parses the query into a hierarchical structure of words or phrases
that is output as the retrieval sentence analysis result 7. In
processing the query, the retrieval sentence analysis unit 5 first
chops the query into words or phrases. In the present example, the
query is chopped into the phrases "chokoreeto rui" and "uriage". The
terms "no" and "ha" are zyoshi, whose significance will be described
in more detail below.

Once the query has been divided into words or phrases, vocabulary
analysis is performed on the words or phrases to determine what each
word or phrase in the query signifies. In performing such vocabulary
analysis, the retrieval sentence analysis unit 5 references the word
dictionary 4 to determine that "chokoreeto rui" (chocolates and the
like) is a data expression word (see Fig. 3a). The retrieval sentence
analysis unit 5 also determines that "uriage" (sales) is an attribute
item name. The word dictionary 4 indicates that both of these phrases
are nouns. The dictionary 4 is not referenced for the zyoshi "ha" and
"no".

Syntactic and semantic analysis are then performed on the query. In
particular, syntactic analysis is performed to process the syntax of
the query in order to understand the role each phrase serves in the
query. Semantic analysis, on the other hand, is performed to
understand what is being requested by the query.

Subsequently, semantic analysis is performed to relate the meaning of
the query to the database entries. The semantic analysis relies on the
hierarchical table model 6 (see Fig. 3b) to ascertain that "chokoreeto
rui" (chocolates and the like) is an attribute data expression word of
a commodity group in table 18 (i.e., table C in Fig. 3c) and "uriage"
(sales) is an item name in the table 14 (i.e., table A in Fig. 3c).
Moreover, the hierarchical table model 6 (Fig. 3b) indicates that
table 14 is a higher order table than table 18. Since the attribute
item appearing in the low order table is a noun, and a zyoshi "no" is
added thereto, it is recognized that the attribute "chokoreeto rui" in
table 18 modifies the attribute "uriage" (sales), which appears in a
higher order table 14. Using these results, a retrieval formula
"retrieval condition: (commodity group name = chokoreeto rui),
retrieval object: uriage" is obtained and is output from the retrieval
sentence analysis unit 5. Subsequently, retrieval from the database 9
is performed by the retrieval processing unit 8 to obtain the desired
data.
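
The rule just described for the first conventional system can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the dictionaries WORD_LOCATION and TABLE_RANK and
the function build_formula are hypothetical names, and the rank
numbers merely encode the stated ordering in which table 14 is higher
than table 16, which is higher than table 18.

```python
# Hypothetical lookup: word -> (table number, attribute it resolves to).
WORD_LOCATION = {
    "chokoreeto rui": (18, "commodity group name"),
    "uriage": (14, "uriage"),
}

# Assumed encoding of the hierarchy: a larger rank means a higher order table.
TABLE_RANK = {14: 3, 16: 2, 18: 1}


def build_formula(modifier, particle, head):
    """Build a retrieval formula when a lower order noun plus "no"
    modifies an attribute of a higher order table."""
    mod_table, mod_attr = WORD_LOCATION[modifier]
    head_table, _ = WORD_LOCATION[head]
    if particle == "no" and TABLE_RANK[mod_table] < TABLE_RANK[head_table]:
        return {
            "retrieval condition": (mod_attr, modifier),
            "retrieval object": head,
        }
    raise ValueError("query does not fit the hierarchical table model")


formula = build_formula("chokoreeto rui", "no", "uriage")
print(formula)
```

The ValueError branch mirrors the limitation discussed later for this
system: a query that does not fall under the defined hierarchy cannot
be processed.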

Figs. 4a, 4b and 4c show dictionaries used in a second conventional
database retrieval system, as disclosed in Japanese Patent Laid-Open
Publication No. 59-99539. In these dictionaries, information on column
name in a file, information on data item name, and information on a
file name that possesses a common column name or data name, are stored
according to file names of a data file that is contained in a
database. Fig. 4a represents a dictionary in which one of the database
files contains the column name of a file. The dictionary also holds
information regarding the order in which the column is contained in
the file and additionally holds information regarding synonyms of the
column name (i.e., file numbers and column attribute numbers of
columns that are synonymous with the named column). Fig. 4b shows an
analogous dictionary in which one of the files contains a data column
name, and the dictionary stores a position at which the named column
is contained in the file. Lastly, the dictionary stores information
regarding synonyms of the data column name. Fig. 4c shows a dictionary
holding information as to semantically identical data columns that are
connected as synonyms.

Fig. 5 shows the designated format for input queries for the second
conventional system. This format requires that queries be entered as a
number of entries, wherein each entry includes two fields: a noun
field and a particle or auxiliary field. Thus, for the example query 1
(Fig. 1) used in the discussion of the first conventional system, the
input query for the second conventional system would be as follows.
The first noun field would be entered as "chokoreeto rui" and the
corresponding particle field would be entered as "no". Further, the
second noun field would be entered as "uriage" and the particle field
would be entered as "ha".
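
As a rough illustration of this entry format, the following Python
sketch splits a romanized query into (noun, particle) pairs. The
function name to_entries, the PARTICLES set and the romanized query
string are assumptions made for the example; the second conventional
system itself expects the user to fill in the fields directly.

```python
# Assumed particle inventory; only the two zyoshi named in the text.
PARTICLES = {"no", "ha"}


def to_entries(text):
    """Group whitespace-separated tokens into (noun, particle) entries."""
    entries, noun = [], []
    for token in text.rstrip("?").split():
        if token in PARTICLES:
            entries.append((" ".join(noun), token))
            noun = []
        else:
            noun.append(token)
    return entries


print(to_entries("chokoreeto rui no uriage ha?"))
# [('chokoreeto rui', 'no'), ('uriage', 'ha')]
```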

In this second conventional system, queries in a natural Japanese
format cannot be analyzed. Likewise, the retrieval object is
determined in view of the restriction of the designated format shown
in Fig. 5. A pertinent data file may, thus, be accessed only by
limited terminology including synonyms recorded in the dictionaries.

In the first conventional information retrieval system described
above, it is necessary to have previously constructed a hierarchical
table model. Since, however, in general, it is not always possible to
place the content of a database into a hierarchy, input sentences
which do not fall under the defined hierarchical structure cannot be
processed. Further, there is no flexibility in receiving natural
language phrases or words, such as "sengetsu" (last month), which are
not in the database. The system is limited solely to the phrases
included in the database. Still further, no information is provided on
"zyoshi" (particles). Thus, there is also the problem that the
omission of a "zyoshi" cannot be detected.

In addition, when there is an ambiguous word (for example, time
periods or seasons), syntactic analysis is impossible unless the
definition of the ambiguous word is recorded in detail. In some cases,
each interrogator must record the definition on an individual basis
according to his usage of the ambiguous term.

Information retrieval is performed for each of the items recorded in a
file. Thus, an answer cannot be obtained for a question in which a
plurality of files are retrieved as a result of analyzing the input
sentence and in which it is necessary to process such a retrieval
result to obtain a final result.

The foregoing problems in the prior art are 
overcome by the present invention of an informa- 
tion retrieval system. The information retrieval sys- 
tem of the present invention is used for retrieving 
information from a database. The information re- 
trieval system includes a parser for parsing a natu- 
ral language input query into constituent phrases. 
The parser outputs a syntax analysis result. The 
system also includes a virtual table for converting 
phrases of the natural language query to retrieval 
keys that are possessed by the database. The 
virtual table accounts for particles that modify the 
phrases in the input query. A collating unit is provided in the
system for preparing a database retrieval formula from the syntax
analysis result by selecting a virtual table that is used to convert
the phrases to the keys possessed by the database. Lastly, the system
includes a retrieval
execution unit for retrieving data from the database 
on the basis of the database retrieval formula. 

The information retrieval system may also in- 
clude an additional table for converting an undeter- 
mined value phrase in the natural language query 
into a determined value phrase in the database 
based on the syntax analysis result. Still further, 
the information retrieval system may include a ter- 
minology dictionary for identifying entries in the 
virtual table that are to be used in converting 
phrases of the natural language query. The dic- 
tionary includes words representing times and the 
dictionary is used by the parser in obtaining the 
syntax analysis result. When the terminology dic- 
tionary is used, the system may also include a time 
interval definition table in the virtual table for defin- 
ing dates corresponding to words representing 
time. Lastly, the system may include a database 
retrieval formula conversion unit for generating a 
formula in a database retrieval language from the 
database retrieval formula. 

Fig. 1 is a block diagram of a first conventional database retrieval
system illustrating the processing performed by the system;
Fig. 2 is a block diagram of a data processing system suitable for
implementing the first conventional system;
Fig. 3a is a more detailed depiction of the word dictionary 4 of
Fig. 1;
Fig. 3b is a more detailed depiction of the hierarchical table model 6
of Fig. 1;
Fig. 3c is a more detailed depiction of the database 9 of Fig. 1;
Figs. 4a - 4c illustrate dictionaries in a second conventional
database retrieval system;
Fig. 5 illustrates the input format for queries with the second
conventional database retrieval system;
Fig. 6 is a block diagram of an embodiment of the present invention
illustrating the processing performed by the embodiment;
Fig. 7 is a more detailed depiction of the terminology dictionary 26
of Fig. 6;
Figs. 8a - 8c are more detailed depictions of tables held in the
virtual table 28 of Fig. 6;
Fig. 9 is an illustration of a syntax tree that is output by the
parser 22;
Fig. 10 is a flowchart of steps performed by the system in processing
a natural language query;
Fig. 11 is a more detailed depiction of a definition table in the
virtual table 28;
Fig. 12 is a depiction of an example natural language correspondence
logic formula;
Fig. 13 is a depiction of the modified version of the formula of
Fig. 12;
Fig. 14 is a more detailed depiction of the collating unit 30 of
Fig. 6;
Fig. 15 is a depiction of a Definition Table A in the virtual table 28
of Fig. 6;
Figs. 16a and 16b are diagrams illustrating the operation of the
system with a query that employs the seasonal time period;
Figs. 17a - 17c illustrate the processing of an entity table logic
formula;
Fig. 18 is a depiction of a database retrieval word grammar definition
table 155 that is contained in the virtual table 28 of Fig. 6;
Fig. 19 is an example of a database retrieval formula processing for
the entity table logic formula of Figs. 17a - 17c;
Figs. 20a and 20b illustrate the grouping in syntactic trees of two
complex queries; and
Figs. 21a and 21b depict additional virtual tables employed for the
processing of the queries of Figs. 20a and 20b.

A preferred embodiment of the present invention will now be described
with reference to the drawings. Fig. 6 shows the construction and flow
of processing of a first preferred embodiment of the present invention
which provides a database retrieval system that responds to a natural
language query 1. Like the first conventional system of Fig. 1, the
system may be implemented on a data processing system as shown in
Fig. 2. This first preferred embodiment includes an input unit 2, a
conversation control unit 3 and a database 9 like those employed in
the conventional system of Fig. 1. These components are implemented in
the data processing system 10 as discussed for the first conventional
system. The preferred embodiment, however, differs from the
conventional system in several respects. These distinctions are
highlighted below.

The first preferred embodiment also includes a parser 22 for parsing
an input natural language query into its constituent parts. The
parser 22 uses a grammar table 24 and a terminology dictionary 26. The
grammar table 24 holds information for regulating the relation in a
Japanese sentence, and the terminology dictionary 26 defines the part
of speech and meaning of each word in the query 20. While the
terminology dictionary 26 is similar to the conventional word
dictionary 4 shown in Fig. 1, the terminology dictionary of Fig. 6
differs in that it includes a column for a semantic marker (see
Fig. 7). The role of the semantic marker is described in more detail
below. A column for a semantic ID (see Fig. 7) and a column for a
correspondence item are also provided. The parser analyzes the input
query 20 to determine the subject, predicates and other parts of
speech in the input natural language query 20.

The system of Fig. 6 differs substantially from the conventional
system of Fig. 1 in that the system of Fig. 6 includes a virtual
table 28. The virtual table is a natural language conversion virtual
table held in memory 12 (Fig. 2), for designating which table in the
database 9 is to be searched to find the data requested in the
query 20.

In general, there are two types of data in the 
database 9. There is fixed data, such as a master 
file for defining "object", and there is variable data, 
which continuously changes in accordance with 
"event". Variable data is also referred to as a 
cumulative file. Fixed data having the same char- 
acteristics are grouped to form a virtual table. Fur- 
ther, a virtual table is formed by adding variable 
data to those fixed data items which are strongly 
related thereto. 

The virtual table 28 is composed of a number of tables (i.e. tables
1 - 8) as shown in Figs. 8a - 8c. Each entry in these tables includes
a field for a "surface restriction" and a field for a "correspondence
attribute" (see Figs. 8a - 8c). The surface restriction field is
filled with data only for variable data. The surface restriction field
is used to store particles which modify each header word of the input
natural language and which determine the value of the "correspondence
attribute" in combination with the header word. That is, the surface
restriction is an item that is provided for performing a further
selection when a plurality of corresponding attributes are possible
for a header word.
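
A minimal Python sketch of how a surface restriction could narrow the
choice among several correspondence attributes is given below. The
entries, the placeholder attribute names ATTR_A to ATTR_C and the
function correspondence_attribute are all hypothetical; the actual
contents of the tables in Figs. 8a - 8c are not reproduced here.

```python
from typing import Optional

# Hypothetical entries: (header word, surface restriction, correspondence
# attribute). None means the surface restriction field is left empty.
ENTRIES = [
    ("uriage", "no", "ATTR_A"),
    ("uriage", "ha", "ATTR_B"),
    ("shoohin mei", None, "ATTR_C"),
]


def correspondence_attribute(header: str, particle: Optional[str]) -> Optional[str]:
    """Pick the correspondence attribute for a header word, using the
    modifying particle only when more than one candidate exists."""
    candidates = [entry for entry in ENTRIES if entry[0] == header]
    if len(candidates) == 1:
        return candidates[0][2]
    for _, restriction, attribute in candidates:
        if restriction == particle:
            return attribute
    return None


print(correspondence_attribute("uriage", "ha"))
print(correspondence_attribute("shoohin mei", None))
```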

The correspondence attribute may designate 
another virtual table, a database entity table, or an 
operation entity table. Designation of another virtual 
table indicates that detailed data are stored in the 
other table. Further, the storage in this fashion is 
used in an algorithm for selecting a virtual table. 
Specifically, if a virtual table is designated in a 
correspondence attribute field, the designated vir- 
tual table is selected with priority. 



The system of Fig. 6 also includes a collating unit 30 for retrieving
data from the database 9 by referencing the virtual table 28 using the
analysis result that is output from the parser 22. The collating unit
may be implemented in software that is executed by the CPU 11 (Fig. 2)
and stored in memory 12.

The system further includes a database formula generation unit 32 for
converting an entity table logic formula from the collating unit into
a database retrieval formula. The database retrieval formula is used
by a retrieval unit 34 that retrieves data from the database 9.

Terms such as "no" and "ha" in the input natural language query 20 are
zyoshi. In Japanese, these terms serve to identify the role served by
the words that precede them. For instance, in the example natural
language input query 20 shown in Fig. 6, the zyoshi "no" modifies the
phrase "Chokoreeto rui" (chocolates and the like) to indicate that
"Chokoreeto rui" is the object of a prepositional phrase. Similarly,
the zyoshi "no" follows the word "sengetsu" to indicate that
"sengetsu" is the object of a prepositional phrase. Lastly, the zyoshi
"ha" modifies the term "uriage" (sales) to indicate that "uriage" is
the subject of the query. The zyoshi help to construct the
hierarchical model shown in Fig. 9 that is output from the parser 22.

Before discussing the operation of this system in detail, it is
helpful to provide an overview of operation of the system. Initially,
the natural language input query 20 (Fig. 6) is input by the input
unit 2 and received by the communications controller 3. The
communications controller 3 directs the input query to the parser 22.
The grammar table 24 is used by the parser 22 to examine grammatical
rules that help to parse the query into an appropriate syntax tree
like that shown in Fig. 9. The parser 22 also uses the terminology
dictionary 26 to determine which of the tables in the virtual table 28
should be examined. Specifically, the "item" column of the terminology
dictionary, as shown in Fig. 7, is examined.

The collating unit 30 (Fig. 6) then determines which of the tables in
the virtual table 28 will be utilized. For the example of natural
language query 20, table 1 (see Fig. 8a) is examined. The entries for
the corresponding terms are examined in the table. The correspondence
attribute field of each entry specifies the table and entry in the
database 9 (Fig. 6) where information regarding the term of interest
may be found, another correspondence table, or an indication that the
desired data is calculated as a mathematical function. The information
retrieved by the collating unit 30 (i.e., the entity table logic
formula) is then passed on to the database formula generation unit 32,
which converts this information into a database retrieval formula for
retrieving from the database. The database retrieval formula is passed
from the database formula generation unit 32 to the retrieval unit 34,
which retrieves the appropriate data from the database 9. The
retrieved data is then output to the output device 17 (Fig. 2).

The operation of the system of Fig. 6 will now be described in detail.
Initially, a natural language query 1 "Chokoreeto rui no sengetsu no
uriage ha?" (Sales of chocolates and the like in the last month?) is
entered using the input unit 2. The communications controller 3 passes
this query to the parser 22. Retrieval order and operation order of
the retrieval language are defined at the communications controller 3.
The parser 22 parses the query according to known strategies for
parsing Japanese queries to produce a syntax analysis result (like the
syntax tree shown in Fig. 9). The parser 22 uses the grammar table 24
and the terminology dictionary 26 in performing its parsing. The
grammar table 24 is a set of extended context-free grammatical rules
such as outlined in "Iwanami Koza, Zyoho Kagaku 23: Kazu to Shiki to
Bun no Shori," Chapter 5, "Kikai Honyaku," Iwanami Shoten.

The terminology dictionary 26 also has a format as outlined in the
above described article. This format is shown in Fig. 7. To eliminate
ambiguities in the meaning of a word, a semantic ID is given to each
word. The semantic ID helps to associate the input term or words with
a term or words that are understandable to the database 9 (Fig. 6).
For example, since there is no retrieval key for "shoohin"
(commodity), "shoohin mei" (commodity name) is designated as the
semantic ID for "shoohin". The database 9 (Fig. 6) includes
information regarding the commodity name. Analogously, since there is
no entry for "choko rui" (chocolates and the like) in the database,
"chokoreeto rui" (chocolates and the like) is designated as its
semantic ID.
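
The role of the semantic ID can be pictured as a simple lookup that
normalizes an input word to a word the database understands. The
sketch below is an illustration only; the dictionary SEMANTIC_ID holds
just the two examples given in the text, and the fallback of returning
the word unchanged when no semantic ID is recorded is an assumption.

```python
# The two mappings named in the description; a real terminology
# dictionary would hold one semantic ID per header word.
SEMANTIC_ID = {
    "shoohin": "shoohin mei",       # commodity -> commodity name
    "choko rui": "chokoreeto rui",  # short form -> form known to the database
}


def to_retrieval_word(word):
    """Return the semantic ID for a word, or the word itself if none is
    recorded (assumed fallback)."""
    return SEMANTIC_ID.get(word, word)


print(to_retrieval_word("choko rui"))  # chokoreeto rui
print(to_retrieval_word("uriage"))     # uriage
```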

Each entry in the terminology dictionary 26 (Fig. 7) also includes a
semantic marker. The semantic marker is provided to connect an
ambiguous word (i.e., not directly defined in the virtual table) to a
correspondence attribute. Further, the semantic marker serves to
combine words that are identical under the semantic restriction in the
virtual table. For example, since there are no such retrieval keys for
"sengetsu" (last month) in the virtual table 28 (Fig. 6), the semantic
marker for this term is month (date), hence indicating that this term
is an indication of date on a monthly basis. Similarly, the terms
"kyonen" (last year), "hi" (day) and "toshi" (year) are also assigned
semantic markers that indicate that the terms refer to date. A
plurality of semantic markers may be allowed for a word (e.g. "uriage"
in Fig. 7). In such instances, the item in the virtual table 28
(Fig. 6) that is capable of corresponding to a retrieval key of the
database 9 is searched by following semantic restriction on the
virtual table designated by the semantic marker. Further, in the
terminology dictionary 26, a column for corresponding items (e.g. the
"ITEM" column in Fig. 7) is provided for designating which one of the
tables of the virtual table 28 (Fig. 6) should be referenced.

Furthermore, in the case wherein the term for which a terminology
dictionary entry is sought is a numerical value having no
corresponding virtual table entry, a correspondence attribute is
determined by the modifying-modified relation thereof or a semantic
marker for units of numerical values. Alternatively, an actual value
is determined in accordance with the definition of an entity table.

As a result of the analysis performed by the parser 22, the
construction of the query is identified and the object of the
interrogation is known. It is necessary to conform the object of
interrogation to an item possessed by the database. While several
methods may be employed for this purpose, the most effective method is
one in which the virtual table is provided to associate similar
meanings which are referenced as different words in the database. By
providing a virtual table, alteration of and/or addition to the system
is easy compared to a method in which the retrieval object item of the
database is directly entered into a terminology dictionary. Further, a
variety of different natural Japanese queries may be correctly
processed and the queries may employ various different modifier
representations.

The parser 22 (Fig. 6), thus, produces a hierarchical syntax tree like
that shown in Fig. 9. This result indicates that the sales (i.e.
"uriage") are what is sought. The term "Chokoreeto rui" (chocolates
and the like) specifies the commodity group for which sales are
sought, and the term "sengetsu" (last month) indicates the time frame
for which the sales data is sought. This syntax tree is passed to the
collating unit as the syntax analysis result (see step 40 in Fig. 10).
The syntax tree is not directly converted into a database retrieving
logic formula, but rather is converted into an intermediate
representation known as a virtual table logic formula. Then an
appropriate table in the virtual table 28 (Fig. 6) is selected
(step 42 in Fig. 10).

For the example query 20 of Fig. 6, the terminology dictionary 26
(Fig. 7) is referenced. Specifically, the "item" field is examined for
"sengetsu" (last month). The item field points to Table 5 in the
virtual table 28 (Fig. 6). Thus, Table 5 (Fig. 8c) in the virtual
table 28 (Fig. 6) is examined. The entry for "sengetsu" has a
correspondence attribute pointing to Definition Table B-21.
Accordingly, the entry with argument 21 in Definition Table B is
examined (see Fig. 11a). This table entry sets forth the method of
calculation for "sengetsu". "Sengetsu" (the last month) is a value
which varies according to the point in time of input and, therefore,
must be calculated.

In order to understand the method, it is important to first understand
the format in which the date is held. The current date is an 8 decimal
digit number with digits 8-5 holding the year (e.g. "1992"), digits 4
and 3 holding the month (e.g. "07", for July) and digits 2 and 1
holding the day (e.g. "11"). Thus, an example format for the date of
July 11, 1992 is "19920711".

If July 11, 1992 is the current date, the Definition Table B tells the
system how to calculate the last month (i.e. June or "06"). First, one
is subtracted from the month digits 4 and 3. Hence, a result of (07-1)
or 06 is obtained. Then, the system checks whether the result is 00.
In this case, the result is not zero. If the result of the subtraction
is 00, it is an indication that the last month was December of the
previous year. In that event, the month digits 4 and 3 are replaced
with 12 for December, and the year digits 8-5 (the high order digits)
are decremented by one. Lastly, the day digits 2 and 1 are replaced
with 00.
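
The calculation set forth for "sengetsu" can be written out as a short
Python sketch. The function name last_month and the use of an integer
to hold the 8-digit date are assumptions made for illustration; the
steps follow the description above.

```python
def last_month(current_date: int) -> int:
    """Compute the "sengetsu" value for an 8-digit YYYYMMDD date."""
    year, month_day = divmod(current_date, 10000)  # digits 8-5 and digits 4-1
    month = month_day // 100                       # digits 4 and 3
    month -= 1                                     # subtract one from the month
    if month == 0:                                 # 00 means December last year
        month = 12
        year -= 1
    return year * 10000 + month * 100              # day digits become 00


print(last_month(19920711))  # 19920600
print(last_month(19930115))  # 19921200
```

The second call exercises the year rollover: one subtracted from month
01 gives 00, so the month becomes 12 and the year is decremented.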

Next, a table in the virtual table 28 (Fig. 6) for "sengetsu" (last
month) is selected. In the terminology dictionary 26 (Fig. 7), a
plurality of virtual tables are designated for "chokoreeto rui"
(chocolates and the like). Specifically, Tables 1 and 3 are
designated. An entry in the terminology dictionary 26 is also examined
for the term "uriage" (sales). The entry for "uriage" (sales)
designates Table 1. Given that both the entry for "Chokoreeto rui" and
the entry for "uriage" specify Table 1 of the virtual table 28,
Table 1 is selected. Once the appropriate table in the virtual
table 28 is selected, an intermediate representation is formed by the
collating process (step 44 in Fig. 10) performed by the collating
unit 30.
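
The choice of Table 1 can be viewed as taking the tables that every
relevant word designates. The following Python sketch shows that idea
with a set intersection; ITEM_TABLES and select_common_tables are
hypothetical names, and treating the selection as a plain intersection
is a simplifying assumption rather than the full selection algorithm.

```python
# "Item" column of the terminology dictionary for the three query words,
# as stated in the description.
ITEM_TABLES = {
    "chokoreeto rui": {1, 3},
    "uriage": {1},
    "sengetsu": {5},
}


def select_common_tables(words):
    """Return the virtual table numbers designated by every given word."""
    designated = [ITEM_TABLES[word] for word in words]
    return set.intersection(*designated)


print(select_common_tables(["chokoreeto rui", "uriage"]))  # {1}
```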

The collating unit 30 (Fig. 14) internally com- 
prises: a virtual table selection unit 60, for selecting 
a table in the virtual table 28 (Fig. 6); an actual 
value calculation/combination unit 62 (Fig. 14) for 
performing calculations and combination; and an 
interrogative structure determining unit 64 for de- 
termining the structure of interrogations that are 
passed to the database formula generation unit 32. 

The collating process involves incorporating
the contents of a dictionary referenced by the input
natural language query into the table of the virtual
table that was selected at step 42 in Fig. 14 or
performing attribute coupling between virtual
tables. In the example case, two virtual tables have
been selected: Table 1 (by the entries in the
terminology dictionary for "uriage" and "Chokoreeto
rui") and Table 5 (by the entry for "sengetsu"). A
natural language correspondence logic formula 50
is generated as shown in Fig. 12. The correspondence
logic formula 50 is a table that sets forth
what information is known from the query and what
additional information is needed to complete the
query. Specifically, it sets forth the relevant
variables and any values of these variables that are
known.

"Chokoreeto rui" is entered in the "shoohin 
gun mei" (commodity group name) item of the formula
50 because "chokoreeto rui" (chocolates and the like) is
a commodity group name. This is known from the
first table in the virtual table 28 (Fig. 6). Further,
"URI" and "date" are variables for which the
values are not yet determined. Those variables
represented by the same word have the same value and
represent the same attribute. In this example,
"URI" in the question and "URI" in "uriage hyo"
are identical to each other. Note that values for
those items other than the necessary items are not
needed. A mark "*" indicates that no value is
entered.
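
One possible in-memory form of such a formula is sketched below in Python. The layout of formula_50, the class Var and the item names other than those quoted in the text are assumptions made for illustration; Fig. 12 is not reproduced here.

```python
class Var:
    """An undetermined variable; variables with the same name share one value."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"?{self.name}"

NO_VALUE = "*"  # mark indicating that no value is entered

# Hypothetical layout of formula 50 for the example query.
formula_50 = {
    "question": {"uriage": Var("URI")},
    "uriage hyo": {
        "shoohin gun mei": "chokoreeto rui",  # known from the query
        "uriage": Var("URI"),                 # same variable as in the question
        "date": Var("date"),
        "other items": NO_VALUE,
    },
}

def undetermined(formula):
    """Collect the names of the variables that still need values."""
    return sorted({value.name
                   for table in formula.values()
                   for value in table.values()
                   if isinstance(value, Var)})

print(undetermined(formula_50))  # ['URI', 'date']
```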

In step 46 of Fig. 10, a necessary virtual table
is added to access the database 9 (Fig. 6). In this
example, table 3 (Fig. 8b) of the virtual table 28
(Fig. 6) is selected based on the correspondence
attribute of "shoohin gun mei" (commodity group
name) in table 1 (Fig. 8a), which specifies Table 3-2.
The entry in table 3 directs the user to Database
Table entry 3-2 (e.g. DB 3-2). In addition, the actual
value of "sengetsu" (last month) is calculated from
the Definition Table B (as was discussed above).
The table thus provided is indicated by 52 in Fig.
13. The data shown assumes that the current date
is in May 1990. Hence, the last month is April 1990
or "19900400". The commodity group code serves
as the attribute for connecting Table 1 and the
commodity group master table, and it possesses
"Code" as an undetermined variable.

This table 52 is converted into a database
retrieval formula by the database formula generation
unit 32 (Fig. 6) at step 48 (Fig. 10). Retrievals
are performed sequentially by the retrieval unit 34
(Fig. 6) based on the retrieval formula to fill the
undetermined variables in the table 52 (Fig. 13).
First, the undetermined variable "Code" is determined
from commodity group master table 19 (i.e.,
table C in Fig. 3c) to be 200, which corresponds to
"chokoreeto rui" (chocolates and the like).

The system then looks to the correspondence
attribute for "uriage" (sales) (see Fig. 8a), which is
"fun-sum (DB1-4)". The symbol "fun" indicates
that some kind of calculation is needed. With the
definition table B-21, if for example the last month
is April of 1990, the value for the last month is
obtained from the value for the current date as an
operation result "19900400". In a similar manner,
fun-sum (DB1-4) is an operation for obtaining the
sum of the numerical values on the sales column
(column 4) in Table A of the database (Fig. 3c).
The system then may access Table A to sum all
the sales entries in the sales column for commodity
group code 200 items during the month of
April 1990.
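
The two retrievals may be sketched in Python as follows. The rows, the second group name and the column order of TABLE_A are invented for illustration, as the actual contents of the database tables are shown only in the figures; the name fun_sum simply mirrors the "fun-sum" notation.

```python
# Hypothetical data: commodity group master table (name -> code) and
# Table A rows as (date, commodity, group code, sales).
GROUP_MASTER = {"chokoreeto rui": 200, "other group": 300}
TABLE_A = [
    (19900403, "A", 200, 120),
    (19900421, "B", 200, 80),
    (19900410, "C", 300, 50),
    (19900502, "A", 200, 70),
]

def fun_sum(rows, group_code, month_value):
    """Sum column 4 (sales) for one group code within one month.

    month_value is the 8-digit month with day digits 00, e.g. 19900400.
    """
    return sum(sales for date, _, code, sales in rows
               if code == group_code and date // 100 * 100 == month_value)

code = GROUP_MASTER["chokoreeto rui"]    # fills the variable "Code"
uri = fun_sum(TABLE_A, code, 19900400)   # fills the variable "URI"
print(code, uri)  # 200 200
```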

In this manner, the value of URI is filled and
the database retrieval processing is terminated.
The result is then output in a predetermined
format.

The query must be converted into a query set 
forth in a database retrieval language to retrieve 
data from the database. To replace the structure of 
the Japanese natural language query with database 
retrieval formulas, it is necessary to put together 
the restrictions and grammar possessed by the 
database retrieval language in the terminology
definition table 26 (Fig. 6). Construction of the queries
in the database retrieval language is made by
referring to this terminology definition table as
described above. Further, having a separate grammar
definition table 24 produces the advantage that all 
the changes to the database retrieval language 
may be absorbed by the grammar definition table, 
even when the present invention is applied to a 
system using a different database retrieval lan- 
guage. 

As described above, by using the semantic 
marker of a terminology dictionary and the virtual 
table, a database is designated and a conversion is 
made into a retrieval logic formula which is suitable 
even when an ambiguous word is included in the 
query or an omission occurs in the input query. 

As described, in the present invention, no hier- 
archical table model is needed. Further, no consid- 
eration of the hierarchical relation of the database 
is needed. Since the virtual tables have a construction
which directly reflects the hierarchical relation
of the database, construction and alteration are easy.
Further, since the surface restriction and the se- 
mantic restriction are included in the virtual table, 
the collating unit can designate a highly probable 
database file by selecting a suitable virtual table 
even for an ambiguous input query. 

In the above described example, the term 
"sengetsu" (last month) was included in the natural 
language query. This term was an ambiguous word 
related to time. The system also has the capability
of properly analyzing other ambiguous terms relating
to time. Suppose that the Japanese input sentence
is "Kotoshi no haru no uriage ha" (Sales for
the spring of this year?). The parser 22 (Fig. 6)
decomposes this sentence into its constituent parts
"uriage" (sales) and "kotoshi no haru" (the spring
of this year). Further, the parser 22 knows that
"kotoshi no haru" modifies "uriage". The parser 22
looks up the term "kotoshi no haru" in the terminology
dictionary 26 and is directed to an appropriate
table in the virtual table 28. The entry in the virtual
table directs the user to entry 3 in Definition Table
A as shown in Fig. 15. This entry indicates that
spring extends from 03/01 to 05/31. In this manner,
the word "kotoshi no haru" (the spring of this year)
contained in the syntax analysis result is replaced
by "1990 nen 3 gatsu 1 nichi - 1990 nen 5 gatsu
31 nichi" (March 1 1990 - May 31 1990).
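
The replacement may be sketched in Python as follows, assuming that the compound term is recorded as a single word. The mapping DEFINITION_TABLE_A and the function expand_time_word are hypothetical names; only the spring entry given in the text is included.

```python
# Hypothetical Definition Table A entry: term -> (start MM/DD, end MM/DD).
DEFINITION_TABLE_A = {"kotoshi no haru": ("03/01", "05/31")}

def expand_time_word(term, current_year):
    """Replace a recorded time word with concrete start and end dates."""
    start, end = DEFINITION_TABLE_A[term]  # KeyError if the term was never recorded
    return f"{current_year}/{start}", f"{current_year}/{end}"

print(expand_time_word("kotoshi no haru", 1990))  # ('1990/03/01', '1990/05/31')
```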

In this example, however, any combination of
time words to be used must be recorded in the
terminology dictionary as a single word. For example,
when it is desired that "kotoshi" (this year) and
"haru" (spring) be combined as "kotoshi no haru" (the
spring of this year), it is necessary to previously
record "kotoshi no haru" (the spring of this year) in
the terminology dictionary 26 (Fig. 6). Further,
since the definition of a seasonal word or the like
differs from user to user, a terminology dictionary
must be prepared for each user.

As such, an alternative embodiment as shown
in Figs. 16a and 16b may be employed. This
alternative embodiment differs from the first
embodiment in that it includes: a point in time
calculation unit 70, for calculating a specific point in time
from the current date, a time interval definition
table reference unit 80, and a combining unit 82 for
adding the reference result of the time interval
definition table reference unit 80 and the calculated
result of a point in time. Further, a system timer 68
is provided.

Suppose that "sakunen no fuyu no uriage ha"
(Sales during the winter of the last year?) is entered
from the input unit 2 as the input query 66
(Fig. 16a). The parser 22 generates a syntax analysis
result 72 (i.e., a syntax tree) by employing the
grammar table 24 and the terminology dictionary
26. The syntax analysis result contains "sakunen"
(last year) and "fuyu" (winter), which are time
words. The definition of the word "sakunen" (the
last year) is obtained by time calculation, and the
definition of the word "fuyu" (winter) is designated
to be described in the time interval definition table
84 (Fig. 16b).

The syntax analysis result 72 is passed to the
collating unit 30, where the result is received by
the point in time calculation unit 70. At the point in
time calculation unit 70, a point in time calculation
is performed with respect to the current date (e.g.,
"19901224") that is obtained by a system timer 68.
The actual calculation method performed is selected
from the definition provided in Definition
Table B in Fig. 11. The definition that is chosen
depends on the value in the argument column in
the terminology dictionary. In this example, an
8-digit integer value indicating the year "sakunen"
(last year), "19890000", is obtained from the
calculation method corresponding to the value "11"
in the argument column of "sakunen" (the last
year), which states, "Subtract 1 from the four high
order digits and replace the four low order digits
with '0000'." Subsequently, the calculated integer
value is substituted for the portion of "sakunen"
(the last year) in the syntax analysis result 72 to
obtain a point in time calculation result 74.

The time interval definition table reference unit 80
supplies the actual dates corresponding to "fuyu"
(winter). It obtains these dates by referring to the time
interval definition table 84. Hence, as shown in Fig.
15, "fuyu" is defined as starting at "00001201"
(i.e., December 1) and ending at "00010331" (i.e.,
March 31 of the next year). The time interval definition
table reference unit 80 substitutes the retrieved
value 86 for "fuyu" (winter) in the point in time
calculation result 74 to obtain a time interval definition
table reference result 76.

The combining unit 82 combines the actual 
dates corresponding to "sakunen" (the last year) 
and "fuyu" (winter) by addition to obtain a com- 
plete 8 digit range for dates for the interval as 
shown in the calculation result 78. Specifically, the 
year "19890000" is added to the dates of "fuyu" 
"00001201" - "00010331" to obtain "19891201" - 
"19900331". The calculation result "19891201- 
19900331" means "from December 1, 1989 to 
March 31, 1990". The calculation result 78 is then 
processed as discussed in the first embodiment. 
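
The combination by addition may be sketched in Python as follows. The names point_in_time, TIME_INTERVALS and combine are hypothetical; only the "sakunen" rule and the "fuyu" interval stated in the text are modeled, and the values are held as integers rather than as 8 character strings.

```python
def point_in_time(term, current_date):
    """Sketch of the Definition Table B rule for "sakunen" (last year)."""
    if term == "sakunen":
        # Subtract 1 from the four high order digits and
        # replace the four low order digits with 0000.
        return (current_date // 10000 - 1) * 10000
    raise KeyError(term)

# Time interval definition table: "00001201" - "00010331" held as integers.
TIME_INTERVALS = {"fuyu": (1201, 10331)}

def combine(year_term, season_term, current_date):
    """Add the year value to the start and end of the season interval."""
    base = point_in_time(year_term, current_date)
    start, end = TIME_INTERVALS[season_term]
    return base + start, base + end

print(combine("sakunen", "fuyu", 19901224))  # (19891201, 19900331)
```

Because the season interval is kept apart from the year calculation, a user could change the entry for "fuyu" in this sketch without touching the rule for "sakunen".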

By changing the definition of each time word 
described in the time interval definition table 84 
(Fig. 16b), the user may obtain a calculation result 
in accordance with that definition without altering the
terminology dictionary 26 (Fig. 16a). That is, it is 
possible for users to share a terminology dictionary 
and manage the time interval definition table in- 
dividually. This benefit of sharing a terminology 
dictionary is more apparent when it is appreciated 
that a terminology dictionary is large in size and 
amendment of a terminology dictionary is difficult. 
Moreover, if words containing many modifiers are 
to be defined, storage requirements are large. 
Hence, providing a separate terminology dictionary 
for every user is cumbersome. 

The example input natural language queries 1 
(Fig. 6) and 66 (Fig. 16a) requested sales informa- 
tion that could be readily reproduced by the sys- 
tem. The system, however, is capable of handling 
more sophisticated queries that require reasoning. 
For example, suppose that the Japanese input que- 
ry is a sentence "Sengetsu no uriage yori kongetsu 
no uriage ga ooi tokuisaki ha" (What customer had 
more sales in this month than sales in the last 
month?). For such an input natural language query, 
the system produces a retrieving logic formula, 
also known as the entity table logic formula 14, in 
the form 140 shown in Fig. 17a. The formula 140
includes a result table 142 for storing the final
results of the retrieved data. The result table 142
includes a location for storing the customer's name
and tables for storing the total sales of this month
and the total sales of last month. In addition, the
entity table logic formula 140 includes a GT table,
which is a table in the virtual table that performs a
logical operation on parameters to determine if one
parameter (the left side) is greater than the other
(the right side).

The total sales of the last month table includes
a pointer pointing to a last month's intermediate
result table 144 that holds the results of intermediate
calculations that are necessary to determine
the total sales of the last month. Similarly, the total
sales of this month table points to this month's
intermediate result table 146. Both of the intermediate
result tables 144 and 146 seek to have information
regarding the customer code and the total
sales for their respective months. In order to calculate
the total sales of the last month, it is necessary
to determine the calculation object (i.e., what
kind of information is being sought). In addition, it
is necessary to determine the amount of orders
that were received during the month from that
customer. Accordingly, there is an additional table,
the total sales of the last month's intermediate
result table 148. Analogously, a total sales in this
month's intermediate result table 151, which seeks
similar information for this month's sales, is also
provided. Hence, the amounts of received orders for
this month and last month for the specified customer
code are requested and passed to the
database formula generation unit 32, which converts
the logic formula into a database retrieval formula
157 using the database retrieval word grammar
definition table 155. The result table and the various
intermediate result tables 144, 146, 148 and
151 are passed to the database formula generation
unit 32. In addition, equality tables (denoted as EQ
tables) are passed to the database formula generation
unit 32. Specifically, EQ Tables 3 and 4, as
shown in Fig. 17b, are passed to the database
formula generation unit 32. EQ Table 3 seeks to
determine if the received order file date is equal to
the last month date, and EQ Table 4 seeks to
determine if the received order file date is equal to
today's date.

The entity table logic formula 140 is processed
by the database formula generation unit 32 (Fig.
17c), which uses the database retrieval word grammar
definition table to process the logic formula
140. The database retrieval word grammar definition
table is examined by the database formula
generation unit 32 with respect to the retrieval logic
formula 140. The database retrieval word definition
table initially processes the result table as indicated in
Fig. 18. In particular, the system is directed to
select the form SELECT (item) FROM (reference table)
WHERE (condition). Thus, the result table is converted
into a database retrieval formula of
<interrogation 3> of Fig. 19. The retrieval word
grammar definition table 155 has a similar entry for
the intermediate result tables 144 and 146. Further,
the database formula generation unit 32 investigates
the executing order of the specified operations
with respect to one another. In this case, since
the result table 142 designates last month's intermediate
result table 144 and this month's intermediate
result table 146 as "left side > right side"
in the GT table, it is learned that the operation of
left side and right side must be performed before
the GT table can be processed. In other words, it is
seen that determination of the intermediate result
tables must be performed first.

In this manner the execution order is determined
as <interrogation 1>, <interrogation 2>
(there is no restriction on the executing order of
these two), <interrogation 3>.
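
Determining such an order amounts to ordering operations so that each one follows the operations it points to. The Python sketch below uses the standard library graphlib module (Python 3.9 or later) as a stand-in; the text does not state which algorithm the database formula generation unit 32 actually uses, and the dictionary depends_on is a hypothetical encoding of the pointers.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each interrogation maps to the interrogations whose results it points to.
# The GT table in the result table compares the two intermediate results.
depends_on = {
    "interrogation 3": {"interrogation 1", "interrogation 2"},
    "interrogation 1": set(),
    "interrogation 2": set(),
}

execution_order = list(TopologicalSorter(depends_on).static_order())
print(execution_order[-1])  # interrogation 3
```

The relative order of interrogation 1 and interrogation 2 in execution_order carries no meaning, consistent with there being no restriction between those two.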

The system proceeds to process each of the
interrogations as indicated in Fig. 19. In particular,
for interrogation 1, which is the interrogation for the last
month's intermediate result table, the customer table
in the database 9 (Fig. 17c) is retrieved using
retrieval unit 34 to obtain the customer code
information. Furthermore, the system seeks to sum
the amount fields in the received order file of the
database 9. In order to perform this calculation, the
system sums the amount entries having the appropriate
customer code and which meet the date
limitations of last month. The EQ table 3 is used to
ensure that the date requirements are fulfilled. In
this fashion, the intermediate result table is filled in
with the relevant information.

Interrogation 2 involves the processing for this 
month's intermediate result table. The processing 
is the same as interrogation 1 except that different 
date requirements are utilized. Specifically, the
date must correspond to the limitations for this 
month. In this fashion, the information for this 
month's intermediate result table is completed. 

Lastly, interrogation 3 is processed. The
interrogation 3 is the interrogation for the result table.
As Fig. 19 indicates, the customer name in the customer
table of the database is selected, as are
the total sales of the last month table and the total
sales of this month table. This information is
retrieved from the customer table in the database 9
(Fig. 17c) and from the last month's intermediate
result table 144 (Fig. 17b) and this month's
intermediate result table 146. In order for the customer
name to be output, the sales of this month table
must be greater than the sales of last month table
and the customer code of this month's intermediate
result table must equal the customer code in the
customer table.
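
The three interrogations may be illustrated with the Python standard library sqlite3 module as follows. This is a sketch under stated assumptions: the table names, column names and data are invented, and the SQL shown is not the retrieval formula of Fig. 19.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customer (code INTEGER, name TEXT);
CREATE TABLE received_order (code INTEGER, date INTEGER, amount INTEGER);
INSERT INTO customer VALUES (1, 'A shooten'), (2, 'B shooten');
INSERT INTO received_order VALUES
  (1, 19900410, 100), (1, 19900512, 150),
  (2, 19900405, 300), (2, 19900520, 200);
""")

def fill_intermediate(table, month):
    """Interrogations 1 and 2: per-customer sum of amounts within one month."""
    con.execute(f"CREATE TABLE {table} (code INTEGER, total INTEGER)")
    con.execute(f"INSERT INTO {table} "
                "SELECT code, SUM(amount) FROM received_order "
                "WHERE date / 100 * 100 = ? GROUP BY code", (month,))

fill_intermediate("last_month", 19900400)  # interrogation 1
fill_intermediate("this_month", 19900500)  # interrogation 2

# Interrogation 3: fill the result table using the GT condition.
rows = con.execute(
    "SELECT customer.name, last_month.total, this_month.total "
    "FROM customer, last_month, this_month "
    "WHERE this_month.total > last_month.total "
    "AND this_month.code = customer.code "
    "AND last_month.code = customer.code").fetchall()
print(rows)  # [('A shooten', 100, 150)]
```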



In this manner, automatic generation of a
database retrieval formula is possible. Operations
are connected by means of pointers, and a logic unit
for judging executing order is provided in the
database formula generation unit 32 in Fig. 6.

Further, this approach provides the additional
advantage that a plurality of sequenced data retrievals
are possible by way of intermediate results. The
system also provides the advantage that it is possi- 
ble to readily conform to a different database re- 
trieval language by altering the grammar definition 
table. 

Specifically, when the retrieval language is 
changed, the database retrieval formula for a new 
retrieval language may be generated and an exten- 
sive rewriting thereof is not necessary. Rather, a 
simple change in the description of (item), 
(reference table), (condition) or SELECT, FROM, 
WHERE of the designated item to the result table 
of the grammar definition table is all that is re- 
quired. 
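
The effect of confining the retrieval language to the grammar definition table may be sketched in Python as follows. The mapping GRAMMAR_DEFINITION and the function generate_formula are hypothetical, and the second template is an invented syntax included only to show that swapping one entry suffices.

```python
# Hypothetical grammar definition entries: one template per retrieval language.
GRAMMAR_DEFINITION = {
    "sql": "SELECT {item} FROM {reference_table} WHERE {condition}",
    # Invented syntax for a different retrieval language (illustration only).
    "other": "RETRIEVE {item} OF {reference_table} IF {condition}",
}

def generate_formula(language, item, reference_table, condition):
    """Fill the result-table entry of the grammar definition table."""
    template = GRAMMAR_DEFINITION[language]
    return template.format(item=item,
                           reference_table=reference_table,
                           condition=condition)

print(generate_formula("sql", "name", "customer", "code = 200"))
# SELECT name FROM customer WHERE code = 200
```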

For some natural Japanese queries, complicated
processing or a plurality of processing steps must be
performed to analyze the query. For example, there
are instances where data conforming to specific
periods of specific conditions are added together. It
is often desirable to be able to perform a
preprocessing operation at the collating unit for
comparison or grouping. Hence, such preprocessing
may be incorporated into the present invention.

In order to explain such preprocessing, sup- 
pose that the input query is "Mitsubishi shooten no 
uriage yori uriage ga ooi tokuisaki ha" (What
customer has more sales than Mitsubishi shooten?) 
or "(A-shooten no) kotoshi no haru kara aki made 
no uriage ha" (How much were the sales to (A 
store) from the spring to fall of this year?). Figs. 
20a and 20b are helpful in explaining the structure 
of a syntax tree that is produced for an input query 
which requires a plurality of logic formula groups. 
First, the input sentence is broken down by the 
parser 22 (Fig. 6) into elements in the form of a 
tree structure (i.e., the syntax tree) such as the tree 
denoted as "HIKAKU" (comparison) in Fig. 20a and 
the tree denoted as "KARA MADE" (from to) in 
Fig. 20b. Fig. 20a shows the syntax tree for the 
first example query, and Fig. 20b shows the syntax 
tree for the second example query. Particles are 
detected and the elements are forcibly divided at 
the parser 22 (Fig. 6). In Figs. 20a and 20b "ji" 
refers to a word serving as a key and "fu" is a 
modifier. The modifier is used to refer to the sur- 
face restriction or is regarded as a special modifier 
in searching the virtual table. 

The first example query, as shown in Fig. 20a,
seeks to compare sales of two entities. As such, two
tables have to be selected. If only one table is
selected, a comparison cannot be made. Two tables can
be selected by dividing the syntax tree into groups.
A virtual table (see Fig. 21a) corresponding to a 
comparison expression like the "ooi hyo" shown in 
Fig. 20a is provided and a virtual table logic for- 
mula for comparison is generated by indicating the 
relation between the two tables with the compari- 
son virtual table. The comparison virtual table can 
be used for converting a word indicating a compari- 
son meaning in any language to an expression 
such as, [GT] (greater than). The two virtual table 
logic formulas are set by Group (a) in Fig. 20a. 

In a similar manner, as shown in Fig. 21b, by
using the virtual table constructed to have "kara
made" (from ... to), "yori made" (from ... to) tables,
the intermediate logic formulas are determined by
Group (b) as shown in Fig. 20b. It is designated at
Fig. 21b to refer to the definition formula, and
actual dates are determined by the operation
discussed above.

Also, interrogatives may be dealt with to some 
extent by providing an item for surface restriction 
in the virtual table and by investigating the items 
relative to the surface restriction. For example, with 
respect to an input sentence "Nani wo uttaka" 
(What was sold?), since only a commodity name or 
commodity group name falls under those with the 
surface restriction "wo" in "uru hyo", it is possible 
to assume that "nani" (what) refers to one of them. 
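
The use of surface restrictions may be sketched in Python as follows. The mapping URU_HYO is a hypothetical rendering of "uru hyo"; the customer name item and its particle "ni" are assumptions added for contrast, and the function names are illustrative. The same mapping also allows a particle that an item does not accept to be reported as an error.

```python
# Hypothetical surface restrictions for "uru hyo":
# each item lists the particles that may mark it.
URU_HYO = {
    "shoohin mei": {"wo"},       # commodity name
    "shoohin gun mei": {"wo"},   # commodity group name
    "tokuisaki mei": {"ni"},     # customer name (assumed entry)
}

def candidates(particle):
    """Items that an interrogative such as "nani" may refer to."""
    return sorted(item for item, particles in URU_HYO.items()
                  if particle in particles)

def check_particle(item, particle):
    """Return an error message when the particle violates the restriction."""
    if particle not in URU_HYO[item]:
        return "Zyoshi ga chigai masu"
    return None

print(candidates("wo"))                     # ['shoohin gun mei', 'shoohin mei']
print(check_particle("shoohin mei", "ga"))  # Zyoshi ga chigai masu
```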

Further, by collating surface restriction, it is
possible to check particles and to display an error
message for an input sentence with an erroneous
content. For example, with respect to a sentence
"Chokoreeto ga utta shoohin ha" (What commodity
was sold by chocolates?), since there is no "ga" in the
surface restriction of "shoohin" in "uru hyo", it is 
judged as an error and it is possible to display an 
error message "Zyoshi ga chigai masu" (Wrong 
"zyoshi" is used). 

In the system of Fig. 1 described as a conven- 
tional example, an answer is provided in the same 
format at all times. That is, in answering the re- 
trieval result, the response is made in a tabular 
format and not in a sentence format. In some 
cases, the answer in this format is difficult to view. 
To eliminate this disadvantage, a response format 
selection unit may be provided in the retrieval unit. 
This unit should provide at least two types of 
formats, i.e., a tabular format and sentence format, 
as the outputting format. 
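
A response format selection of this kind may be sketched in Python as follows. The function format_response, its parameters and the wording of the sentence format are assumptions made for illustration; the text specifies only that a tabular format and a sentence format should be available.

```python
def format_response(rows, headers, style="table"):
    """Render retrieval results in a tabular format or a sentence format."""
    if style == "table":
        lines = ["\t".join(headers)]
        lines += ["\t".join(str(value) for value in row) for row in rows]
        return "\n".join(lines)
    if style == "sentence":
        return " ".join(
            ", ".join(f"{h} is {v}" for h, v in zip(headers, row)) + "."
            for row in rows)
    raise ValueError(f"unknown style: {style}")

print(format_response([("chokoreeto rui", 200)], ["group", "sales"], "sentence"))
# group is chokoreeto rui, sales is 200.
```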

While the present invention has been shown 
with respect to preferred embodiments thereof, 
those skilled in the art will know of other alternative 
embodiments which do not depart from the spirit 
and scope of the invention as defined in the appen- 
ded claims. For instance, the system may be ad- 
justed to operate on natural language queries that 
are formulated in languages other than Japanese.
Further, the system may be implemented on a data
processing system other than that shown in Fig. 2.

Claims 

1. An information retrieval system for retrieving
information from a database, comprising:

a parser for parsing a natural language
query into its constituent phrases to produce a
syntax analysis result;

a virtual table for converting phrases of the
natural language query to retrieval keys possessed
by the database, said virtual table accounting
for particles that modify the phrases;

a collating unit for preparing a database
retrieval formula from the syntax analysis result
by selecting a virtual table that is used to
convert the phrases of the natural language
query to keys possessed by the database; and

a retrieval execution unit for retrieving data
from the database on the basis of said
database retrieval formula.

2. An information retrieval system as recited in
Claim 1 further comprising:

an additional table for converting an undetermined
value phrase in the natural language
query into a determined value phrase in
the database based on the syntax analysis
result.

3. An information retrieval system as recited in
Claim 1 further comprising:

a terminology dictionary for identifying entries
in the virtual table to be used in converting
the phrases of natural language query, said
dictionary including words representing time,
and said terminology dictionary being used by
the parser in obtaining the syntax analysis
result; and

a time interval definition table in the virtual
table for defining dates corresponding to said
words representing time.

4. An information retrieval system as recited in
Claim 1 further comprising:

a database retrieval formula conversion
unit for generating a formula in a database
retrieval language from the database retrieval
formula.

5. An information retrieval system for retrieving
information from a database, comprising:

a parser for parsing a natural language
query into its constituent phrases to produce a
syntax analysis result;

a virtual table for converting phrases of the
natural language query to retrieval keys possessed
by the database, said virtual table accounting
for particles that modify the phrases;

a collating unit for preparing a database
retrieval formula from the syntax analysis result
by selecting a virtual table that is used to
convert the phrases of the natural language
query to keys possessed by the database;

a retrieval execution unit for retrieving data
from the database on the basis of said
database retrieval formula;

an additional table for converting an undetermined
value phrase in the natural language
query into a determined value phrase in
the database based on the syntax analysis
result;

a terminology dictionary for identifying entries
in the virtual table to be used in converting
the phrases of natural language query, said
dictionary including words representing time,
and said terminology dictionary being used by
the parser in obtaining the syntax analysis
result;

a time interval definition table in the virtual
table for defining dates corresponding to said
words representing time; and

a database retrieval formula conversion
unit for generating a formula in a database
retrieval language from the database retrieval
formula.